In this post I want to cover spreading your SQL Server wings with serverless SQL Pools. As part of a series of posts about spreading your SQL Server wings with the Microsoft Intelligent Data Platform.
Since I covered dedicated SQL Pools in my last post I thought I it was only fair that I covered serverless SQL Pools. So that I can cover things in more detail, including CI/CD options.
By the end of this post, you will have a better understanding of serverless SQL Pools. In addition, where your SQL Server background can prove to be useful. Along the way I also provide plenty of links.
You can find out more about this series of posts in an introductory post I published about the series.
About serverless SQL Pools
You can think of serverless SQL Pools as serverless compute that can be spun-up on demand to run T-SQL syntax within Azure Synapse Analytics. Which can be useful in a variety of situations.
For example, serverless SQL Pools can be used to perform ad hoc queries against data stored in files within storage services.
Another common use of serverless SQL Pools is to create a logical data warehouse. For instance, you can use serverless SQL Pools to create a relational layer on top of data that is stored elsewhere as part of a Data Lakehouse.
Doing this allows T-SQL queries can be ran against external tables that actually point to the data in storage behind the scenes. To help with some jargon here, external tables are virtual tables which point to data stored elsewhere.
One you have created your relational layer, applications such as Power BI to query the logical data warehouse. By connecting to the serverless SQL endpoint that gets deployed with Azure Synapse Analytics.
Whereas dedicated SQL Pools uses Massively Parallel Processing (MPP) to work with data, Synapse serverless SQL Pools utilizes a distributed query processing engine. To cater for its serverless and scaling requirements.
To find out more about its architecture, I recommend that you read the Microsoft guide about the Synapse SQL architecture components. Since it contains a diagram that helps you visualize how the architecture for serverless SQL Pools differs from dedicated SQL Pools.
Working with serverless SQL pools
You can work with serverless SQL Pools in various places within Synapse Studio. For example, you can work with SQL scripts in the Develop hub.
In addition, you can connect to them using applications you tend to use for working with SQL Server. Including SQL Server Management Studio and Azure Data Studio. As long as you have the correct networking and authorization configured.
Within these applications you can connect to the serverless SQL Pool by specifying the serverless SQL endpoint instead of a SQL Server name.
SQL Server background
Due to its requirements and infrastructure, serverless SQL Pools supports a lot less T-SQL syntax than dedicated SQL Pools.
Which is fine if you want to use it as a relational layer. Like in the below example, which shows how you can create an external table.
-- First state the file format
IF NOT EXISTS (SELECT * FROM sys.external_file_formats WHERE name = 'SynapseParquetFormat')
CREATE EXTERNAL FILE FORMAT [SynapseParquetFormat]
WITH ( FORMAT_TYPE = PARQUET)
GO
--Then the data source location
IF NOT EXISTS (SELECT * FROM sys.external_data_sources WHERE name = '{REDACTED}_dfs_core_windows_net')
CREATE EXTERNAL DATA SOURCE [{REDACTED}_dfs_core_windows_net]
WITH (
LOCATION = 'abfss://{REDACTED}.dfs.core.windows.net'
)
GO
-- Finally create the external table
-- Which should look familiar to those who have used Polybase in SQL Server
CREATE EXTERNAL TABLE dbo.DemoDeltaTable (
[Name] nvarchar(4000),
[Present] nvarchar(4000)
)
WITH (
LOCATION = 'UnitedKingdom/Delta/Nice/**',
DATA_SOURCE = [{REDACTED}_dfs_core_windows_net],
FILE_FORMAT = [SynapseParquetFormat]
)
GO
--Afterwards you can then query the table directly
SELECT TOP 100 * FROM dbo.DemoDeltaTable
GO
You can create more advanced views on top of these tables. However, be aware that you cannot create materialized views at this moment in time.
You can even work with statistics in serverless SQL Pools. Whenever people ask me for more details about this I always point them to the Synapse Expresso video about the importance of statistics in serverless SQL Pools.
Anyway, as you can see from the above example, your T-SQL skills can prove to be very useful. Plus, your SQL Server background can come in handy with a lot of other aspects relating to serverless SQL Pools as well. Such as working with authentication.
You can find out more about the T-SQL capabilities in serverless SQL Pools within the Transact-SQL features supported in Azure Synapse SQL documentation online.
CI/CD for serverless SQL Pools
Up until February of this year there were a limited number of ways to perform CI/CD for serverless SQL Pools. Most of which were based on migration-based deployments.
In fact, I showed one way you can perform state-based deployments in the November 2022 edition of the Azure Synapse Analytics and Microsoft MVP series.
In the video I showed how you can perform CI/CD in both Azure DevOps and GitHub by using the DBUp .NET Framework along with the dbops PowerShell module.
These limitations were mostly due to the fact that you could not perform state-based migrations using dacpac files. Like you can with dedicated SQL Pools.
However, as of February of this year the options to perform CI/CD for serverless SQL Pools got a little bit more interesting. Due to the fact that a new version of SqlPackage was released which supports the ability to create and deploy dacpacs for serverless SQL Pools.
You can read more about that in my post about deploying a dacpac to a serverless SQL pool.
Final words
I hope this post about spreading your SQL Server wings with serverless SQL Pools has proved to be useful.
Because I wanted to show just how much your SQL Server background can prove to be useful when looking to work with them. Even when you look to perform CI/CD.
Of course, if you have any comments or queries about this post feel free to reach out to me.
[…] To clarify, when I say serverless SQL Pools I mean the ones that are part of Azure Synapse Analytics. Which I covered in a previous post. […]